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0 .  Executive  Summary 

In  this  paper  various  statistical  models  and  techniques 
are  employed  to  forecast  the  existence  ot  low-level  stratus 
conditions.  They  are  illustrated  for  data  at  an  airport 
(Moffett  Field,  Sunnyvale,  California). 

In  Section  2  the  data  set  is  described  and  the  results 
of  a  preliminary  exploratory  data  analysis  are  given.  These 
suggest  that  dew  point  depression  should  be  predictive  of  the 
existence  of  stratus.  Generally,  low  (high)  dew  point  depres¬ 
sion  is  associated  with  the  existence  (non-existence)  of  stra¬ 
tus.  This  association  is  also  made  evident  by  a  spectral 
analysis  of  hourly  stratus  levels  and  dewpoint  depression 
described  in  Appendix  F. 

The  remainder  of  this  paper  describes  procedures  for  and 
results  of,  fitting  logistic  models  to  the  data  described. 
Validation  of  the  models  are  addressed  as  well.  The  basic 
logistic  model  is 


P{ Y  =  1 j explanatory  variable  x.} 


exp{x  ej 
1  +  exp{x,  ej 


where  >c  is  a  p-vector  (row)  of  explanatory  variables  and  j; 
is  a  p-vector  (column)  of  coefficients  to  be  determined. 


Appendix  E  suggests  several  mathematical  justifications  tor 
use  of  the  logistic  regression  model. 

We  have  used  various  methods  to  tit  logistic  models  tor 
use  as  predictors  on  reserved  data  sets  (cross-validation). 

Our  cross-validation  experiences  are  reported  in  Appendices  a 
through  D.  Appendix  G  contains  the  asymptotic  distribution  ot 
a  threat  score ,  which  is  one  of  the  statistics  we  use  to  compare 
procedures . 

Appendix  A  reports  on  use  ot  the  stepwise  logistic  regres¬ 
sion  procedure  of  the  BMDP  computer  package.  The  procedure 
chooses  variables  to  be  used  in  the  regression  from  a  menu  of 
variables  given  to  it.  The  BMDP  fits  are  then  used  to  predict 
the  occurrence  ot  stratus  tor  independent  data,  i.e.  from  dif¬ 
ferent  years.  We  find  that  the  stepwise  feature  must  be  used 
with  caution;  it  tends  to  overtit,  including  variables  which 
greatly  increase  the  standard  error  ot  the  variables  first 
included  in  the  regression.  Buch  overtitting  deyrades  the 
predictive  powers  ot  the  model. 

Copas  ( lybi)  points  out  that  a  regression  model,  tit  by 
maximum  likelihood  (or  least  squares)  to  one  set  ot  data,  and 
then  used  tor  prediction  on  another  set  ot  data,  nearly  always 
tits  or  predicts  the  new  set  ot  data  less  well  than  it  does  the 
original  set.  This  phenomenon  of  shrinkage  can  become  more  pro¬ 
nounced  if  the  original  regression  model  is  tit  usiny  a  step¬ 
wise  procedure,  which  tends  to  overtit.  Appendix  b  describes 
and  investigates  a  procedure  suggested  by  Copas  to  compensate 
for  shrinkage  on  both  regressions  tit  with,  and  without,  stepwise 


procedures.  In  our  application,  particularly  when  predicting 
changes  from  stratus  to  no  stratus,  the  shrinkage  procedure 
appears  to  help.  However,  it  appears  to  do  less  well  in  predict¬ 
ing  changes  from  no  stratus  to  stratus. 

In  Appendix  C  robust  estimation  procedures  for  logistic 
regression  are  described  and  carried  out  on  the  Moffett  Field 
data.  These  procedures  are  less  vulnerable  than  maximum  like¬ 
lihood  estimates  to  a  few  outlying  data  points  which  may  not 
agree  with  the  model.  For  the  particular  cross  validations 
performed,  predictions  using  models  fit  with  robust  procedures 
were  no  better  than  predictions  made  with  the  models  fit  with 
maximum  likelihood.  The  models  obtained  are,  however,  system¬ 
atically  different  from  their  classical  counterparts. 

In  Appendix  D  we  investigate  the  predictive  use  of  lo¬ 
gistic  regression  models  that  are  progressively  updated  to  em¬ 
phasize  recent  data.  The  suggestion  is  that  models  fit  with 
data  which  are  closer  in  time  to  the  dates  on  which  forecasts 
are  to  be  made  may  be  more  relevant,  owing  to  changing  condi¬ 
tions  not  represented  in  the  model,  than  a  model  which  is  fit 
with  data  of  several  previous  years.  We  found  that  models 
with  updating  often  did  at  least  as  well  as  models  without  an 
updating  feature. 

In  summary,  we  have  found  that  £n(dew  point  depression  +  1) 
appears  to  be  a  consistently  useful  predictor  of  the  occurrence 
of  stratus.  Low  (high)  dew  point  depression  is  associated  with 
stratus  (no  stratus).  There  is  no  one  procedure  or  model,  among 
those  tried  to  date,  that  appears  a  clear  winner.  If,  for 


example,  one  procedure  does  well  in  predicting  changes  from  no 
stratus  to  stratus,  it  will  often  do  less  well  in  predicting 
changes  from  stratus  to  no  stratus.  we  found  that  none  of  the 
procedures  did  as  well  predicting  the  occurrence  of  stratus  in 
1962  as  it  did  in  1961.  This  suggests  tnat  perhaps  1962  is  not 
described  by  the  present  models  as  well  as  is  1961,  being  in¬ 
trinsically  quite  different  from  the  previous  years  195«-ol. 
Models  and  methods  that  represent  year-to-year  differences  will 
come  under  investigation  in  future. 

Further  work,  with  other  models,  and  with  data  from  othe 
locations,  will  be  undertaken  to  shed  light  on  this  important 
prediction  problem. 


Introduction  and  Overview 


The  purpose  of  this  paper  is  to  exhibit  the  use  of  sta¬ 
tistical  tools  and  procedures  for  forecasting  the  existence  of 
low-level  stratus  conditions  at  an  airport.  The  existence  of 
low  stratus  (less  than  or  equal  to  1UUU  ft.)  forces  the  use  of 
different  methods  of  traffic  handling  than  is  the  case  when 
higher  stratus  levels  prevail.  A  low  stratus  condition  tends 
to  inhibit  flight  operations,  so  it  is  desirable  to  forecast 
its  occurrence.  Furthermore,  it  is  of  interest  to  forecast 
such  conditions  on  a  "single-station"  Dasis,  making  use  of  me¬ 
teorological  measurements  available  only  at  the  location — e.g. 
airport — in  question,  in  case  useful  supplementary  information 
is  unavailable. 

The -forecasting  approaches  described  here  are  statisti¬ 
cal  in  nature,  meaning  that  extensive  data  concerning  the  re¬ 
ported  hourly  stratus  level  at  an  airport  (Moffett  Field, 
Sunnyvale,  CA),  together  with  certain  other  meteorological 
measurements  or  parameters  recorded  and  reported  at  that  loca¬ 
tion,  were  used  as  raw  material  tor  the  forecasts.  These  data 
were. used  to  estimate  the  probability  of  low  stratus  during  a 
daily  period;  the  latter  probability  was  estimated  using  a 
logistic  regression  model,  a  tool  that  has  been  found  useful  in 
biological  and  medical  statistics,  and  that  has  been  previously 
applied  in  meteorology;  cf  Brelsford  and  Jones  (1967), 

Gilnausen  (1979),  Gabriel  and  Pun  (1979).  In  a  later  section 


we  present  various  derivations  or  justifications  of  such  a 
model.  Alternative  models  are  also  suggested,  and  the 
usefulness  of  these  will  be  investigated  in  future  work. 


The  usefulness  of  the  logistic  (or  any  other)  model  must 
be  judged  by  its  performance.  We  have  chosen  to  proceed  by  (i) 
fitting  a  model  to  data  for  certain  specific  years  (1968-1960), 
and  then  (ii)  comparing  the  model  predictions  to  actual  occur¬ 
rences  for  a  completely  different  period  (1961,  1962).  such  a 
procedure  is  termed  cross  validation;  see  Mosteller  t  nd  Tukey 
(1977)  for  good  general  discussion  and  references.  The  results 
of  our  cross  validation  are  reported  suDsequently .  Another  in¬ 
teresting  and  possibly  useful  approach  is  to  construct  and  test 
an  adaptive,  automatically  up-dating  forecasting  modei  with 
characteristics  similar  to  "exponential  smoothing”  or  "Kalman 
filtering."  Results  of  some  simple  updating  procedures  tor 
forecasting  will  also  be  reported. 

Successful  forecasting  with  the  aid  of  a  model  requires 
that  the  data  inputs  be  relatively  "clean,"  or  in  basic  con¬ 
formity  with  the  model.  Occasionally  occurring  data  points 
that  are  out  of  line  for  any  reason,  called  outliers ,  or 
influential  values,  can  radically  change  the  values  of  the 
model  parameters  obtained  from  statistical  fitting  principles 
such. as  least  squares  (not  used  for  fitting  our  logistic  model) 
or  maximum  likelihood  (wnich  is  used).  To  check  tor  such 
maverick,  possibly  detrimentally  influential,  values  it  is 
possible  to  proceed  in  several  ways.  One  is  to  successively 
remove  each  data  point  (actually  a  vector  of  response  and 
explanatory  variables)  and  re-fit  the  model,  watching  for 
radical  changes  in  fitted  model  parameters.  This  method  has 
been  programmed  (in  APL,  on  the  NPS  IBM  3033  system)  and 
exercised;  its  defect  is  that  at  present  just  one  data  point 


is  removed  at  a  time,  so  if  several  points  are  mavericks  this 
fact  may  be  overlooked.  Clever  ways  of  automatically  diminish¬ 
ing  the  effects  of  maverick  points  have  been  discussed  by 
Pregibon  (1982);  exploration  of  the  applicability  of  such  ideas 
to  the  present  stratus  prediction  problem  is  currently  underway. 
The  methods  and  some  results  are  reported  here. 

Another  approach  to  the  identification  of  maverick  data, 
and  to  the  possible  discovery  of  an  appropriate  model,  is  by 
computer  graphics.  We  have  initiated  the  examination  of  tne 
low-stratus  data  on  a  pioneering  graphics  facility  at  Stanford 
Linear  Accelerator  (SLaC);  see  an  article  in  science ,  Kolata 
(1982),  for  general  description.  The  SLAC  system  allows  an 
analyst  views  of  various  three-dimensional  space  projections  of 
multidimensional  data-clouds.  Such  examination  helps  to  reveal 
the  association  between  certain  explanatory  ("independent") 
variables  and  the  response  ("dependent  variable")  of  interest. 
For  example,  examination  of  our  stratus  data  indicated  that 
changes  in  the  explanatory  variable  dewpoint  depression  tended 
to  be  reflected  in  changes  of  response,  i.e.  low  stratus  level 
probability.  This  association  has  physical  basis,  and  dewpoint 
depression  had  actually  been  included  in  earlier  exploratory 
logistic  fits  at  the  suggestion  of  W.  Sweet  of  NLPKF;  its 
incorporation  into  the  model  considerably  improves  predictive 
performance . 


2 


The  basic  Data  set 


The  statisticai  methods  used  in  this  study  were  applied 
to  data  furnished  t>y  W.  Sweet  of  NFFkF,  to  whom  we  are  grateful. 
In  summary,  these  data  consist  of  reported  hourly  determinations 
of : 

(i)  stratus  level,  reported  to  be  at  discrete  levels  of 
100  ft.  separation;  possible  recorded  levels  are 
k  x  100  ft.,  k  =  1 , 2 , . . . , 9 , 10 , . . . , "999"  (no  visible 
stratus ) . 


(ii)  east-west  wind  velocity, 
per  hour; 


V  ,  at  surface,  in  miles 
x 


(iii)  north-south  wind  velocity,  V  ,  at  surtace,  in  miles 
per  hour;  ^ 

(iv)  temperature,  at  surtace,  in  degrees  F ; 

(v)  dewpoint,  at  surface,  in  degrees  F; 
all  at  Moffett  Field,  California,  for  the  months  of  July, 

August,  and  September  of  the  years  19i>b-1962;  later  data  are 
also  available,  and  remain  to  be  analyzed.  Although  other 
measurements,  e.g.  of  pressure,  are  in  principle  available,  they 
were  not  utilized  in  the  present  analysis.  Nor  were  measure¬ 
ments  from  neighboring  locations  in  the  San  Francisco  bay  area. 
2.1.  The  Forecasting  Exercise  Data  Set 

The  raw  data  described  above  were  adopted  to  the  fore¬ 
casting  exercises  as  follows: 

(a)  Forecasts  are  made  of  the  existence  of  stratus 
level  less  than  1000  ft.  (£  90U  ft.)  on  any  hour  between 
10:00  pm  (22U0)  on  day  t  ,  and  6:00  am  (06U0)  on  day 
t  +  1 .  If  hourly-reported  stratus  level  ever  fell  to  a 
level  £  900  ft.  during  such  a  period  beginning  on  day  t, 
it  is  agreed  to  say  that  stratus  existed  on  day  t;  otherwise 


that  no  stratus  existed  on  day  t.  Denote  by  the  binary 


indicator  variable  yfc  the  existence  (non-existence)  of 


stratus  on  day  t  according  to  the  above  definition.  Thus 


1  if  stratus  exists  on  day  tr 


yt  = 


0  if  no  stratus  exists  on  day  t. 


Call  yfc  the  response  (or  dependent  variable)  when  forecasting 


for  day  t.  Note  that  the  observed  values  of  response  on  pre- 


rious  days  (yt_^,yt_2, • • • )  are  available  as  assistance 


when 


forecasting  for  day  t.  The  above  definition  of  meaningful 


stratus  agrees  with  instrument/no  instrument  landing  rules  at 


airports,  and  is  thus  of  operational  significance. 


Candidate  explanatory  (independent)  variables  are  these: 


(b)  wind  velocities  at  6:00  pm  (1800)  on  day  t  , 


items  (ii)  and  (iii)  above; 


(c)  temperature  (Tfc)  and  dewpoint  (Dfc)  at  surface  at 
6:00  pm  on  day  t? 


(d)  dewpoint  depression,  =  T^  -  Dfc  at  6:00  pm 


on  day  t; 


(e)  hours  of  stratus  (Hfc_^)  between  2200  on  the  previous 


day  t-1  and  0600  on  the  current  day  t  ; 


(f)  existence/non-existence  of  stratus  (y ,yt_2, • • • ) 


on  previous  days, 


Let  NSfc  denote  the  number  of  consecutive  days  of  stratus  in 


a  run  of  stratus  days  that  includes  day  t-1,  the  day  on  which 


the  prediction  is  made.  NNSfc  is  the  number  of  consecutive  days 


of  no-stratus  in  a  run  of  no-stratus  days  that  includes  day 


t-1. 


Note  that  because  of  the  way  in  which  the  response  y 
is  defined,  it  is  legitimate  and  of  interest  to  forecast  y  in 
terms  of  Tfc,  Dfc ,  Vx(t),  etc.  These  latter  quantities  are 

all  available  at  6:00  pm  for  forecasts  applying  later,  i.e.  fro 
10:00  pm  to  6:00  am  on  the  following  day.  Of  course  many  other 
functions  of  the  hourly  observations  are  candidates  for 
explanatory  variable  status. 


3.  Preliminary  Ana  lysis 


Before  proceeding  to  the  fitting  of  specific  models,  a 
subset  of  the  data  has  been  examined  in  terms  of  simple  summa¬ 
ries.  Since  the  objective  is  to  forecast,  we  have  divided  (con- 


ditioned) 

the 

data  for  the  years 

1958, 

1959, 

1960 

into 

f  ou 

groups : 

Group 

00: 

observations 

such 

that 

yt-l 

=  o. 

yt  = 

0  , 

Group 

01: 

observations 

such 

that 

yt-l 

=  o. 

yt  = 

1  , 

Group 

10: 

observations 

such 

that 

yt-l 

=  1, 

yt  = 

0  , 

Group 

11: 

observations 

such 

that 

yt-l 

=  1, 

yt  = 

1  , 

and  have  then  computed  summaries  of  the  observed  distributions 
of  certain  candidate  explanatory  variables.  The  argument  is 
that  a  noticeable  separation  of  such  distributions  when  predict¬ 
ing  yt  from  the  particular  explanatory  data  suggests  that  the 
variable  in  question  may  be  useful  in  forecasting. 

Note  that  we  have  explicitly  used  the  known  stratus  state 
of  the  system  at  t-1  as  one  important  variable,  wishing  to 
make  full  use  of  persistence,  and  to  improve  upon  it.  We  are 
especially  interested  in  the  power  of  explanatory  variables  and 
their  combinations  to  correctly  forecast  changes  in  stratus 
conditions,  e.g.  from  yt_i  =  0  ( no  stratus  on  day  t.-l)  to 

yt  =  1  (stratus  on  day  t).  Simple  persistence  forecasting, 
which  predicts  y  =  Yt_^  will  never  identify  prospective 
changes . 

Computer  graphic  analysis  carried  out  at  SLAC,  plus 
physical  insight,  suggest  that  dewpoint  depression,  Afc,  should 
be  an  effective  explanatory  variable.  Another  useful  variable 


seems  to  be  the  hours  of  stratus  observed  on  day  t-1,  denoted 
by  •  There  are  limitless  other  plausible  explanatory 

variables,  as  well  as  combinations  and  re-expressions  (trans¬ 
formations)  of  the  latter,  but  here  we  look  at  only  two.  One 
systematic  way  of  uncovering  predictive  combinations  of  explana¬ 
tory  variables  is  by  use  of  some  form  of  principle  component  or 
factor  analysis;  such  work  is  not  reported  here.  It  seems  pos¬ 
sible  that  a  robust  principle  component  analysis  may  he  informa¬ 
tive  (see  Gnanadesikan  (19"7"7),  or  Campbell  (1982)),  for  the 
existence  of  groups  of  maverick-like  data  have  been  reported  in 
the  overall  data  base.  Clustering  procedures  may  also  be  of 
va lue . 

Tables  1  and  2  give  a  few  useful  summaries  of  the  behavior 
of  the  candidate  explanatory  variables  and  these 

have  been  developed  for  the  years  1958,  1959,  1960.  The  figures 
in  parentheses  are  natural  logarithms  of  their  counterparts. 

The  log  transformation  is  suggested  to  symmetrize  the  sample 
distribution  (histogram  or  Tukey  stem-leaf  plot),  which  often 
tends  to  appear  positively  skewed  for  the  above  data.  The 
medians  and  quartiles  are  used  instead  of  the  ordinary  means  and 
standard  deviations  because  of  the  possible  non-robust/res istant 
properties  of  the  latter  traditional  measures. 

We  can  draw  the  following  conclusions  from  Table  1: 

(a)  corresponding  summary  figures  for  dewpoint  depression 
(0,M,0)  are  rather  stable  from  year  to  year. 

(b)  dewpoint  depression  (or  its  log)  should  have  prognostic 
power;  roughly  speaking. 


TABLE  1 


Observed  Distribution  of  Dew  Point  Depression 


Year 

Lower  Quart ile 

Median 

Upper  Ouartile 

(Q) 

(M) 

(0) 

1958; 

1 

0: 

9(2.2) 

9(2.2) 

10(2.3) 

1 

1; 

6(1.  ->9) 

"Ml. 95) 

9(2.2) 

0 

1 : 

6(1. T9) 

8(2.08) 

8(2.08) 

0 

-► 

0; 

10(2.3) 

13(2.56) 

17(2.83) 

1959; 

1 

-► 

0: 

8(2.08) 

9(2.2) 

11(2.4) 

1 

-► 

Is 

6(1. 79) 

7(1.95) 

9(2.2) 

0 

Is 

7 ( 1 .95 ) 

9(2.2) 

10(2.3) 

0 

0: 

10(2.3) 

14(2.64) 

16 ( 2 . 77 ) 

1960; 

1 

-► 

0: 

7  ( 1 . 95 ) 

10(2.3) 

13(2.56) 

1 

Is 

6  (  1 . 79  ) 

8(2.08) 

9(2.2) 

0 

-► 

Is 

7  ( 1 . 95 ) 

8(2.08) 

lx ( 2 . 4  ) 

0 

0: 

10(2.3) 

13(2.56) 

16(2. 77) 
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if  stratus  is  present  at  time  (day)  t-1,  and  if 


is  relatively  high  (9  or  above),  a  change  to 


no  stratus  is  indicated,  while  if  Afc  is  rela¬ 


tively  low  (below  9)  the  stratus  condition  tends 


to  continue;  on  the  other  hand 


if  no  stratus  is  present  at  time  (day)  t-1,  and 


if  Afc  is  relatively  high  (10  or  above)  the  no 


stratus  condition  tends  to  continue,  while  if  A. 


is  relatively  low  (below  10)  changes  to  a  stratus 


condition  become  more  frequent. 


These  results  are  physically  plausible,  and  appear  con¬ 


sistently,  if  not  overwhelmingly  strongly,  in  the  present  data. 


Figures  1  and  2  show  box  plots  of  dew  point  depression 


and  *,n(dew'  point  depression  +  1)  for  the  years  1958-60  (cf.  Tukey 


andMosteller  (19'7'7)).  Each  of  the  four  plots  in  the  figures 


contain  only  those  points  for  which  y  ^  =  i  *  3  =  Yt  for 


i,j  =  0,1  .  The  top  (bottom)  edge  of  the  box  is  the  upper 


(lower)  quartile  of  the  data  set;  the  symbol  within  the  box  is 


at  the  median;  the  lines  connect  the  mean;  and  the  circles  out¬ 


side  the  boxes  represent  outlying  data  points. 


It  appears  from  the  top  two  plots  in  each  figure  that  dew 


point  depression,  Afc  ,  may  have  more  prognostic  value  if  there 


is  no  stratus  the  day  before.  If  there  is  no  stratus  the  day 


before,  then  high  Afc  appears  to  be  associated  with  persistence 


of  no  stratus.  Since  the  box  plots  do  overlap,  it  is  clear  that 


A  will  not  provide  perfect  prediction. 


I  CURE 


LN(TEMP-DEW) 

BOX  PLOTS  FOR  LN(TEMP-DEW)  FOR  0-0 


FIGURE 


An  exploratory  spectral  analysis  of  hourly  infstratus 


height)  and  £n(dew  point  depression  +  1)  for  195b  described  in 
Appendix  F  also  suggests  that  high  (low)  dew  point  depression 
is  associated  with  high  (low)  stratus  height. 

In  Table  2  are  corresponding  figures  for  hours  of  stratus 
on  previous  days. 

TABLE  2 

Observed  Distribution  of  Previous  Days'  Hours  of  Stratus 


Year 

Lower  Quart ile 

(0) 

Median 

(M) 

Upper  Quart ile 

(0) 

1958; 

1-0: 

2(0.7) 

4(1.4) 

8(2.1) 

1-1: 

6(1.8) 

7(1.9) 

8(2.1 ) 

1959; 

1'-  0: 

3(1.1) 

4(1.4) 

4(1.4) 

1  -  1: 

4(1.4) 

6(1.8) 

8(2.1) 

1960; 

1  -  0: 

3. 0(1.1) 

4(1.4) 

4(1.4) 

1-1: 

4(1.4) 

6(1.8) 

8(2.1) 

Again 

the  figures 

in  parentheses  are 

logs . 

Again  some 

indications  from 

the  table 

are  of  interest: 

(a)  corresponding  summary  figures  are  rather  stable,  but 

somewhat  less  so  than  for  , 

(b)  relatively  low  values  of  previous  days'  hours  of  stratus 
tend  to  be  associated  with  change  to  no-stratus  condi¬ 
tion,  but  the  tendency  is  rather  weak. 

The  tendency  noticed  above  may  possibly  be  accounted  for  by  the 

fact  that  an  underlying  weather  system  is  passing  over  the 

Moffett  area.  Towards  the  end  of  its  sojourn  there  the  hours 

of  resulting  stratus  tend  to  gradually  decrease  to  zero. 


Box  plots  for  the  number  of  hours  of  stratus  the  day 
before  when  there  is  stratus,  for  years  1958-60  appear  in  Figure 
3.  bach  figure  contains  only  those  points  for  which  the  current 
day  has  no  stratus  or  stratus  respectively.  There  appears  to  be 
an  association  between  a  high  number  of  hours  of  stratus  the  day 
before  and  persistence  of  stratus.  The  association  does  not 
appear  strong,  however. 

Although  the  above  sort  of  analysis  is  interesting,  it 

tails  to  incorporate  the  joint — possibly  interactive — effects 

of  several  variables.  Note  that  no  such  analysis  is  reported 

here  for  the  other  possible  explanatory  variables  related  to 

surface  wind,  namely  V  and  V  .  Somewhat  surprisingly, 

x  y 

these  have  been  found  to  have  secondary  value  for  the  location 
and  years  under  investigation. 
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In  Appendix  H  we  yive  several  mathematical  justit  ications 
tor  use  ot  the  logistic  regression  model.  In  the  present  Appen¬ 
dix  results  are  given  ot  tittiny  various  logistic  models  to 
available  Mottett  field  data  tor  years  l9bb-bl);  they  are  cross- 
validated  tor  years  1961  and  1962. 

Here  the  term  mode  1  refers  to  the  basic  logistic 
representation 

Wy3!  I  explanatory  variables  x}  =  I^^'xpjxg  }  (A-l) 

where  x  is  a  p-vector  (row)  ot  explanatory  variables,  and  £ 
is  a  p-vector  (column)  of  coefficients  to  be  determined.  The 
BMDP  package  performs  the  fitting,  i.e.  determination  ot  jj 
from  observations,  by  maximum  likelihood  or  a  closely  related 
method.  It  also  furnishes  Student  t-values  for  assessing  the 
statistical  significance  of  the  coefficients  determined,  and 
has  a  step-wise  facility,  which  enters  explanatory  variables  in 
accordance  with  their  judged  explanatory  value.  The  above  pro¬ 
cedure  assumes  that  the  model  is  appropriate  tor  the  data,  a 
practice  that  may  be  danyerous  in  observational  studies,  as  has 
been  pointed  out  by  Preyibon  (1962),  who  suyyests  some  remedies. 
An  examination  ot  remedies  tor  dealiny  with  possibly  "ili- 
fittiny"  data  by  the  logistic  is  currently  in  proyress,  and  will 
be  applied  to  the  Moffett  Field,  and  other,  data. 

In  the  exercises  reported,  we  have  fitted  19bb-l96U  data 
by  logistic  models  using  the  variable  selection  feature.  Two 
types  of  tits  are  considered.  In  one  type  we  condition  on  the 


previous  day's  stratus  state;  in  other  words  p^fx.)  means  the 
probability  of  stratus  on  day  t  ,  given  no  stratus  on  t-1  and 
the  influence  of  explanatory  variables  x.  *  ( x.)  means  the 

probability  of  stratus,  given  stratus  on  t-1.  In  the  other  type 
we  have  fit  all  the  data  at  once,  using  an  indicator  variable  to 
identify  stratus  -  no  stratus  days. 

The  predictions  made  are  categorical:  i.e.  if  the  calcu¬ 
lated  p-value  exceeds  0.5,  stratus  is  predicted,  while  if  below, 
no  stratus  is  predicted.  We  have  cross-validated  predictions 
against  the  years  1961  and  1962. 

Model  A-l:  Prediction,  given  no  stratus  the  previous  day 
(y  ^  =  0).  The  explanatory  variables  selected  are:  a  constant, 
An  (Afc  +1),  Vy  .  The  fit  is  as  follows  with  standard  errors  of 
the  fitted  parameters  in  parentheses  below: 

X0  =  6.63  -  3.65  ln(A  >  -  0.08-78  V 
—  t  y 

(  1.70)  (0.-741)  (0.0495) 

where  A  «  Afc  +  1  . 

The  cross  validations  results  for  1961-19b2  (F  means  Forecast, 

A  means  Actual)  are  below. 

1961-1962 


Fraction  Correct 


88  4-  6 

88+6+7+ie 


.80 


Note  that  simple  persistence  forecasting  ("tomorrow  is 
the  same  as  today")  for  both  1961  and  1962  gives  a  fraction  of 
correct  forecasts  equal  to  0.83  ( ( 88+7 )/( 88+7+16+4 ) ) ,  which  is 
actually  slightly  better  than  the  logistic  forecast.  However,  the 
present  logistic  model  does  correctly  forecast  about  one-quarter 
to  one-third  of  the  changes  from  no  stratus  to  stratus  correctly; 
persistence  will  never  correctly  forecast  a  change. 


Model  A-2:  Prediction  given  stratus  the  previous  day  (y  ^  =  1). 

The  variables  selected  were  £n(At  +  1)  and  •  The  fit  is 

x  &  =  6.12  -  3.34  *n(A  )  +  0.30  H 

( 2 . 07 )  (0.940)  t  (0.0893)t_ 

where  A  t  =  A  +  1  . 

The  numbers  in  parentheses  beneath  the  coefficients  are  standard 
errors  based  upon  the  assumption  of  a  correct  model,  maximum- 
likelihood  fitted. 

The  cross  validation  results  are  below. 


1961-1962 


A  \ 

F  0 

1 

Fract ion 
Correct 

0 

16 

6 

.  73 

1 

14 

29 

.  67 

Correct 

=  0.69  = 

16  + 

14  +  29  + 

29 

16  +  6 

In  this  case  the  logistic  model  did  as  well  as  persistence  (0.66) 
in  predicting  stratus  and  no  stratus.  Furthermore,  it  predicted 
73%  of  the  changes  correctly. 
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The  results  of  the  validation  for  1961-1962  of  the  two 
fits  in  Exercises  A-l  and  A-2  are  combined  in  the  following  table 

Prediction 


Success 

failure 

Fraction  Correct 

0  +  0 

88 

0.93 

0+1 

6 

16 

0.27 

1  +  0 

16 

6 

0.73 

1  +  1 

29 

14 

0 . 67 

The  threat  score  for  predicting  changes  from  0  +  1  is 


'0  +  1 


T  =  _ '  -  =  "  =  o  21 

0  N0+l  +  F0+0  (  6+16+7 )  °-21 


(A-2) 


where 

Cn  ,  =  the  number  of  correct  predictions  of  change  from 
0  +  1, 

Nq+1  =  the  total  actual  number  of  changes  from  0+1, 

Fn  =  the  number  of  incorrect  predictions  of  no  change 

0  +  0. 

Similarly  the  threat  score  for  predicting  changes  from 
1  +  0  is 


T  =  1*0 
1  N, .„+F, 


16 


1+0  L 1+1 


16+6+14 


=  0.44  . 


( A-3 ) 


The  threat  score  for  predicting  all  changes 


TT  = 


c  +  c 

^0+1  1+0 


N0+l+  Nl+0+  F0+0+Fl+l 
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The  fraction  of  correct  predictions  using  the  logistic 


models  is 


88+6+16+29 

182 


0.-76. 


The  fraction  of  correct  predictions  using  persistence  is 


88  +  7  +  29  +  14 
182 


=  0.76, 


Of  course  persistence  predictions  will  never  be  correct  when  a 
change  takes  place,  while  the  methods  just  presented,  and  others, 
may  actually  do  quite  well  and  seem  worth  the  extra  trouble. 

Model  A— 3 ;  Prediction  based  on  all  data.  The  variables  selected 
ares  a  constant,  fcn(Afc  +  1),  and  Ht_1.  The  fit  is 

x  £  =  6.73  -  3.39  £n(Afc  +  D  +  .225  H  x 

(1.30)  (0.570)  (0.0507) 

where  A  =  Afc  +  1  . 

Again  numbers  in  parenthesis  are  standard  errors. 

The  cross-va 1 ida t ion  results  are  below: 


0+0 

0+1 


1961-1 962 
Prediction 

Success  Failure  Fraction  Correct 

91  4  0.96 

5  17  0.23 


0+0 


16 


6 


0.73 


Fraction  Correct 


91  +  5  +  16  +  30 
182 


0.7b 


SI 


n 


Fraction  Correct  Persistence  = 


yi  +  4  +  30  +  13 


=  0.7b, 


The  threat  scores  for  predicting  changes  are 


T0  5+17+4 


=  0.19 


T1  16+6+13  '  U*46 


TT  = 


16  +  5 

16+5+17+6+4+13 


=  0.34 


The  threat  scores  for  the  fit  using  all  the  data  are  about 


the  same  as  those  for  the  separate  fits,  i.e.  those  that  condition 


on  whether-  or  not  stratus  existed  on  the  aay  before.  The  fraction 


of  correct  predictions  of  stratus  and  no  stratus  is  also  about  the 


same  as  that  for  the  two  separate  fits,  and  tor  prediction  by 


persistence.  We  conclude  that  doing  separate  fits  based  on 


whether  or  not  there  is  stratus  the  day  before  may  not  be 


profitable;  a  single  logistic  model  may  do  as  well  as  two. 


APPENDIX  B 


Shrinkage 

The  term  "shrinkage"  is  used  in  connection  with  the 
following  phenomenon:  a  regression  model  fit  by  maximum  likeli¬ 
hood  (or  least  squares)  to  one  set  of  data  which  is  then  used 
for  prediction  on  another  set  of  data  nearly  always  fits  the 
new  set  of  data  less  well  tnan  it  does  the  original  set.  Copas 
(1982)  points  out  that  shrinkage  can  be  more  pronounced  if  the 
original  regression  fit  is  made  with  the  aid  of  a  stepwise  pro¬ 
cedure;  the  latter  tends  to  overfit.  He  suggests  using  the 
following  logistic  model  tor  binary  prediction: 

P{Y  =  1 1  explanatory  variable  x.) 

n  • 

exp(e'+K  l  3! (x  -  x  )} 
u  i =1  1  1  1 

=  - * — n — ; - : - 

1  +  exp{ e . +K  l  b! (x.-  x. )} 

i_l  i 

—  a.  L. 

where  x^  is  the  mean  of  the  i  explanatory  variable  tor  the 
original  data.  {b|}  are  the  MLE  estimators  for  the  original 
data  and  K  is  a  shrinkage  parameter;  K  =  1  means  that  there 
is  no  shrinkage.  Data-derived  prescriptions  can  be  found  for 
K,  but  in  the  exploratory  work  reported  here  we  have  found  several 
numerical  trial  values  and  taken  note  of  their  general  effects. 

The  stepwise  regression  procedure  of  BMDP  was  used  to 
fit  a  logistic  model  to  data  from  1958-6U.  This  model  with,  and 
without  shrinkage  was  then  used  to  predict  the  occurrence  of 
stratus  in  the  years  1961-62.  Tables  J  and  4  give  the  results 


of  the  cross  validation.  Note  that  shrinkage  slightly  improves 
the  prediction  of  no  stratus  on  the  following  day. 

Tables  5  and  6  give  the  results  of  fitting  logistic  models 
to  the  data  from  1958-60  and  using  the  models  with  and  without, 
shrinkage  to  predict  stratus  in  1961.  Four  different  models  were 
used.  The  parameters  of  the  models  are  as  follows 

Parameters 

A  constant,  l n  At,  yt-1 

AW  constant,  Jin  A  ,  y.  .,  V  ,  V 
Model  y 

B  constant,  £n  Afc,  NSfc,  NNS^ ,  H  ^ 

BW  constant,  in  Afc,  NS^ ,  NNSfc,  H^_^,  V^, 

where  Afc  is  the  dew  point  depression  plus  1. 

The  models  were  fit  using  maximum  likelihood.  Stratus  was 
predicted  on  day  t  if  the  forecast  probability  of  stratus  was 

greater  than  or  equal  to  a  .  The  cutoff  point  a  was  taken  to 

be  0.5,  or  alternatively  0.41,  the  fraction  of  days  of  stratus 
during  the  years  1958-1960. 

Tables  7  and  8  give  similar  results  for  the  models  fit  to 
data  in  1958-61  and  validated  on  1962  data.  The  cutoff  point  a 
was  taken  to  be  0.5,  or  alternatively  0.37,  the  fraction  of  days 
of  stratus  during  the  years  1958-61. 

Tables  9  and  10  give  the  threat  scores  for  the  prediction 
of  changes  (equations  A-2,  A-3,  and  A-4 ) . 

The  simplest  model  A  with  a  cutoff  of  0.5  seemed  to  do  as 
well  as  any  of  the  more  complicated  models.  The  rise  of  the 
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historical  fraction  of  stratus  days  sometimes  improved  prediction 
of  changes,  but  not  in  all  cases.  The  use  of  shrinkage  once  again 
often  seemed  to  improve  prediction  of  changes  from  stratus  to  no 
stratus  but  again  not  uniformly.  Models  A  and  B  with  no 
shrinkage  did  as  well  as  the  stepwise  BMDP  procedure  with  no 


shrinkage 


Table  3 


Validation  on  61-62  ot  BMUP  stepwise  Kit 
Usiny  all  Data  58-60  with  Shrinkage 


K 

- 1 

c 

. 

O' 

U 

75 

RF 

.4 

transitions 

s 

K 

KC 

S 

K 

KC 

S 

K 

KC 

s 

K 

KC 

0*0 

91 

4 

.96 

93 

2 

.  98 

93 

2 

.98 

95 

0 

1.0 

0-1 

5 

17 

.23 

3 

19 

.14 

3 

19 

.14 

1 

21 

.05 

1-0 

16 

6 

.73 

17 

5 

.77 

17 

5 

.77 

17 

5 

.77 

1-1 

30 

13 

.70 

26 

17 

.60 

26 

17 

.60 

22 

21 

.51 

Validation  on  1961  of  BMDP  Stepwise  Fit 
Using  all  Data  58-60  with  Shrinkage 


K - 

_ 1 _ 

0 

.6 

~ — 1 

0. 

5 

"  ‘~1 

0. 

4 

transitions 

s 

K 

FC 

S 

K 

FC 

S 

F 

FC 

S 

H 

1 

0-0 

53 

3 

.95 

54 

2 

.96 

54 

2 

.96 

56 

D 

0-1 

3 

8 

.27 

3 

y 

.27 

3 

8 

.27 

1 

fl 

1  +  0 

8 

3 

.73 

9 

2 

.82 

9 

2 

.82 

9 

B 

1-1 

9 

4 

.69 

8 

5 

.62 

8 

5 

.62 

7 

B 

FC  =  fraction  correct  predictions 
S  =  number  of  successful  predictions 
F  *  number  of  unsuccessful  predictions 
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Table  5 

Validation  on  1961  of  Predictions  with  Shrinkage 
of  Models  Fit  with  MLE  Using  all  Data  from  1958-60 


Shrinkage  _ 

Parameter  Cutoff 


Model 

0.5 


transitionsT  8 


0  +  0 
0  +  1 
1  +  0 


transitions 


0 

0 

0 

-► 

1 

1 

-► 

0 

0 

-► 

0 

0 

1 

1 

-► 

0 

54 

2 

3 

8 

9 

2 

8 

5 

transitions 


0.41 


Model 


C 


71  11  13  .46 

81  10  57  .85 


S 


.46  14  10  .58 

.94  7  bU  .89 


S 


15 

9 

11 

5b 

16  .33 
63  .94 


5 

0  I  0 


transitions 


0  +  0 
0+1 
1  +  0 
1  +  1 


10  .09  3 

0  1  7 

9  .31  11 


3  . 

8  .  21 

4  .  64 
2  .  85 


21  2 
64  10 


\um 


.85 
59  I  .88 


FC  =  fraction  correct  predictions 
,  _  Number  of  Days  of  stratus  in  1958-1960 
Total  Number  ot  Days  in  1958-1960 

Explanatory  variables  in  model  A  =  constant,  y  in(At) 

Explanatory  variables  in  Model  AW  =  constant,  y.  ,,  &n(A.) 


Table  7 


n-j  w  ~j  y|  | O'  H<7 


Table  9 


Threat  Scores  for  1961  Validation 
of  Models  fit  with  MLE  on 
data  from  1958-1960 


Model 

1 _ A _ i 

1 _ AW _ 1 

B  I 

BW 

Cutpoint 

0.5 

0.41 

0.5 

0.41 

0.5 

0.41 

0.5 

0.41 

T 

0 

0.21 

0.26 

0.18 

0.26 

0.21 

0.21 

0.20 

0.26 

K  =  1 

T1 

0.54 

0.50 

0.44 

0.58 

0.57 

0.58 

0.50 

0.54 

TT 

0.37 

0.35 

0.30 

0.39 

0.39 

0.35 

0.34 

0.38 

T 

0 

0.23 

0.21 

0.23 

0.28 

0.23 

0.19 

0.23 

0.17 

K  =  0.6 

Tx 

0.56 

0.54 

0.53 

0.50 

0.56 

0.58 

0.56 

0.54 

TT 

0.41 

0.37 

0.40 

0.38 

0.41 

0.3b 

0.41 

0.32 

T 

0 

0.23 

0.21 

0.23 

0.22 

0.25 

0.21 

0.25 

0.17 

K  =  0.5 

Ti 

0.47 

0.54 

0.47 

0.44 

0.56 

0.50 

0.53 

0.50 

TT 

0.38 

0.37 

0.38 

0.32 

0.43 

0.36 

0.41 

0.31 

T0 

0.09 

0.21 

0.18 

0.20 

0.09 

0.21 

0.09 

0.20 

3 

II 

* 

*! 

0.55 

0.54 

0.50 

0.44 

0.50 

0.57 

0.47 

0.47 

TT 

0.39 

0.37 

0.39 

0.32 

0.34 

0.39 

0.33 

0.33 

Model 


Explanatory  Variables 


A 

constant 

Hn(Afc ) 

*t-l 

AW 

constant 

*n(At ) 

*t-l 

V 

X 

V 

y 

B 

constant 

*n(At ) 

NSt 

NNSfc 

Ht-i 

BW- 

constant 

*n(At) 

NSt 

NNSt 

Ht-1  Vx 

No,  days  of  stratus  during  1958-60 
No,  days  during  1958-60 
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fsiij 


2 

M 

9 


Cutpoint 


=  0.6  T. 


=  0.5  T, 


0.50 

0.37 

0.17 

0.21 

0.47 

0.40 

0.34 

0.31 

0.50  0.37 


0 

0.35 

0.24 


0  0.17 
0.35  0.47 
0.24  0.34 


0  0.17 
0.32  0.47 
0.24  0.34 


0.50 

0.37 

0.17 

0.21 

0.40 

0.35 

0.31 

0.29 

0 

0.17 

■ 

KK9I 

m 

0 

0.17 

0.33 

0.32 

0.23 

0.26 

0 

0.17 

0.26 

0.37 

0.19 

0.29 

.la 

.40 

.32 


0 

.35 
.24 


0 

0.31 

0.22 


0  0.17 
0.24  0.40 
0.18  0.31 


Model 


0.37  = 


_ Explanatory  Variables 

constant  £n(Afc)  yt-1 

constant  £n(At)  yt-1  V* 

constant  £n(At)  NSfc  NNbfc  Ht-1 

constant  £n(At>  NSfc  NNS ^  Ht-1 

No.  days  OH  stratus  during  1958-61 
No.  days  during  195B-61 


APPENDIX  C 


Robust  Estimation  for  Binary  Logistic  Regression. 


Maximum  likelihood  estimates  are  susceptible  to  outlying 


data  points:  they  are  unduly  influenced  by  a  few  (exceptional) 


data  points  which  may  not  agree  with  the  assumed  model.  Pregibon 


(1982)  suggests  robust  procedures  which  yield  estimates  that  are 


resistant  to  a  few  such  exceptional  data  points.  The  procedure 


that  has  been  used  in  this  report  is  as  follows. 


Let  the  deviance  of  point  i  be 


d^  =  -2  (yi  An  pt  +  (1-y^  An(l-pi>),  i  = 


=  1, . . .  ,N 


(C-l) 


where  ir  the  logistic  model 


Pi  = 


expIx^B  > 

1  +  expfx^B} 


(C-2) 


XjS  =  e0  ♦  SjXjj  +  e2*i2  *  ...  ♦  epxip  ; 


(C-3) 


x^k  is  the  value  of  the  k—  explanatory  variable  for  the 


i—  data  point,  and  8^  is  the  estimate  of  6k  ,  the  regression 


coefficient  for  the  k—  explanatory  variable,  x.  ;  k  =  l,...,p. 


The  problem  of  finding  the  MLE  estimators  turns  out  to 


be  to  solve  for  8y,...,§^  in  the  non-linear  equations 


i^1  Xik  yi  ~  Xi£ 

1  1  +  e 


(C-4) 


for  k  *  1 , . . . ,p. 


P 


** .  ■»*.  •* ,  «* « •*,  •*,  •*.  .  **.  •' ,  «*,%*  *  ,  •'»  «*,  **,  •*_  •*.  **,  •*.  • 


One  possible  robust-resistant  (insensitive  to  outliers) 


procedure  is  to  find  estimators  fjg,...,p  such  that 


where 


l  w(i)xik  yi - TT 

i  =  l  i  + 


if  d.  <  H 
l  — 


w(  i  )  = 


(H/d.  ) 


otherwise , 


( C-5 ) 


(C-6  ) 


d^  is  the  deviance  of  the  i—  data  point  and  the  fitted  model 
at  that  point,  from  (C-l). 

A  value  of  H  =  1.35  was  suggested  by  Pregibon  and  used 
for  the  tuning  constant;  if  H  =  »  the  procedure  carries  out 
the  ordinary  MLF;  fitting,  while  as  H  decreases  the  effects  of 
extreme  local  deviance  points  have  progressively  less  effect  on  the 
fitted  model.  Notice  that  the  i—  data-determined  weight,  w(i), 
is  made  relatively  small  if  d(i)  is  large.  Thus  data  points 
which  are  not  well  fit  by  the  assumed  model  will  tend  to  receive 
less  weight  than  others  that  are.  The  resistant  estimates,  £  , 
are  found  by  iteration.  First  the  MLE  estimate  is  found  and  the 
initial  weights  computed.  Then  (C-5)  is  solved  for  (0(1), 

K 

k  =  l,...,p>  by  a  Newton-Raphson  procedure.  New  weights  w  (1) 
are  computed  from  (C-6).  Then  these  are  entered  in  (C-5),  and  it 


is  solved  for  {(3^(2),  k  =  l,...,p};  this  process  repeats  until 
the  iterative  estimates  converge. 


On  each  day  either  stratus  occurs  or  not.  If  stratus 
occurs  on  consecutive  days  then  a  run  of  stratus  days  is  said 
to  occur.  Let  NSfc  be  the  length  of  the  run  of  stratus  days 
that  includes  day  t-1.  For  example,  NSfc  =0  if  the  previous 
day  had  no  stratus,  so  y  ^  =  0;  while 

NSfc  =  2  if  y  J  =  1,  y t_2  =  1,  yfc_3  =  0  .  Let  NNSfc  be  the 
length  of  the  last  run  of  no  stratus  days  that  includes  day  t-1. 

Table  (11)  gives  the  estimates  for  five  iterations  of  the 
robust  procedure  applied  to  a  model  using  1958-1960  data.  The 
explanatory  variables  are:  constant,  NSt,  NNSfc,  Ht-1'  An(Afc). 
where  A^  is  the  dew  point  depression  plus  1. 

TABLE  11 

Results  of  Iteration  of  Resistant  Procedure 


Number  of 
Iteration 

Constant 

NSfc 

NNSfc 

Ht-1 

£n ( A t ) 

0  (MLE ) 

6.81 

-0.01 

-0.05 

0.21 

1 

U> 

• 

u> 

1 

9.30 

-0.04 

-0.05 

0.28 

-4.55 

2 

9.98 

-0.05 

-0.04 

0.30 

-4.88 

3 

10.16 

-0.06 

-0.04 

0.31 

-4 . 97 

4 

10.21 

-0.06 

-0.04 

0.31 

-5.00 

5 

10.22 

-0.06 

-0.04 

0.31 

-5.00 

Note  that 

except  for 

the  estimated 

value  of 

NNSt 

resistant  procedure  has  made  the  estimates  greater  in  absolute 
value.  Such  sharpening  of  the  expression  is  a  common  occurrence 
when  robust  logistic  procedures  are  utilized. 
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We  fit  this  model  B  robustly  to  19bb-6U  data  and  then  used 


the  fitted  model  to  predict  the  occurrence  of  stratus  with  a  cutoff 
point  of  U.5.  We  also  robustly  fit  model  B  to  195b-61  data  and 
used  it  to  predict  the  occurrence  of  stratus  in  1962.  Although  the 
estimated  parameters  using  the  robust  procedure  were  different,  the 
results  of  the  cross-validation  were  almost  the  same  as  with  the 
maximum  likelihood  fit  reported  in  Appendix  B.  Results  of  the 
cross-validations  with  models  fit  robustly  appear  in  Table  lb  at 
the  end  of  Appendix  L). 
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APPENDIX  D 


Logistic  Models  with  Updating 

Despite  best  attempts  to  develop  a  single  model  with  which 
to  predict  stratus  in  any  yiven  year,  the  resulting  model  may 
sufter  trom  lack  of  timeliness.  The  basic  reason  is  that  simple 
models  titted  with  data  trom  one  period  may  well  not  be  entirely 
relevant  to  another,  owing  to  changing  conditions  not  represented 
in  the  model.  One  attractive  procedure  tor  dealing  with  the  lack 
ot  timeliness  issue  is  to  progressively  update  the  model  fit  so 
as  to  incorporate  recent  data,  i.e.  data  representing  conditions 
near  in  time  to  those  to  be  torecast.  This  is  the  philosophy  ot 
the  well-known  Kalman  filter.  In  the  present  context  the  updating 
procedure  has  been  carried  out  completely  straightforwardly,  i.e. 
by  simply  re-computing  estimates  using  recent  data.  Computation¬ 
ally  economical  and  sophisticated  methods  remain  to  be  developed. 

we  report  the  results  ot  an  investigation  ot  updated  model 
tits  to  predict  the  occurrence  ot  stratus.  Three  updating  schemes 
were  tried. 

I.  a  model  was  initially  tit  usiny  all  data  trom  the  previous 
year.  Then  a  forecast  of  the  occurrence  ot  stratus  was  made  using 
the  model  for  the  first  ten  days  ot  the  current  (torecast)  year. 
These  ten  days  were  then  added  to  the  forecasting  data  set,  and 
the  eldest,  or  initial,  ten  days  ot  data  were  dropped.  The  model 
was  re-tit  usiny  the  updated  data.  Using  the  new  model,  the  oc¬ 
currence  ot  stratus  the  next  ten  days  of  the  current  year  was 
forecast.  Then  the  second-eldest  ten-days-worth  of  data  were 
dropped,  and  the  newest  ten  days  were  added,  and  the  model  was 


re-fit,  forecasts  made,  and  so  the  process  was  continued.  This 
may  be  referred  to  as  a  90-day  rolling  forecast  in  steps  of  ten 
days . 

II.  A  model  was  initially  fit  using  all  data  from  the  previous 
year.  A  forecast  for  the  occurrence  of  stratus  was  made  for  the 
first  day  of  the  current  year.  This  data  point  was  added  to  the 
forecasting  data  set,  and  the  eldest  point  deleted.  The  model  was 
refit  using  the  altered  modeling  data  set.  A  forecast  of  the  occur 
rence  of  stratus  was  made  for  the  next  day  of  the  current  year. 

This  data  point  was  added  to  the  modeling  data  set  and  the  oldest 
point  was  dropped,  and  so  forth.  This  is  a  rolling  forecast  in 
one-day  steps. 

III.  Same  as  II  but  the  initial  modeling  data  set  includes  only 
the  last  45  points  of  the  previous  year. 

Two  different  sets  of  explanatory  variables  were  tried, 


and 

B  with  anc 

1  without  wind  speeds,  where 

A: 

constant , 

yt-l'  *n(At)* 

AW: 

constant , 

yt_1,  An  ( A  t )  ,  V^t),  Vy(t) 

B: 

constant , 

NS,  NNS  ,  H  x,  An(A  > 

BW: 

constant , 

NSt,  NNSfc,  Hfc_1 ,  An  (  A  t )  ,  V^t), 

as  before;  Afc  is  the  dew  point  depression  plus  1. 

A  prediction  of  stratus  was  made  if  the  forecasted  proba¬ 
bility  was  greater  than  a  .  In  most  cases  a  =  0.5.  Additionally 
a  was  sometimes  taken  to  be  the  fraction  of  the  number  of  days 
of  stratus  over  all  years  previous  to  the  current  year. 

The  results  are  summarized  in  Tables  12-14  of  threat  scores 
( r  T  i  r  TT  )  and  fraction  of  correct  predictions  (  FC  ) .  For  com¬ 
parison  purposes  results  are  also  given  for  prediction  without 
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updating.  full  tables  ot  the  numbers  ol  correct  and  incorrect 
predictions  can  be  tound  in  Tables  lb-la. 

As  stated  previously,  the  cutoff  point,  a  ,  for  the  up¬ 
dating  procedures  was  either  U.b,  or  alternatively,  the  historical 
fraction  of  days  of  stratus.  tor  the  simpler  model  A,  the  use  ot 
the  historical  traction  appeared  to  improve  prediction  ot  stratus, 
but  to  worsen  the  prediction  of  no  stratus,  using  robust  estimates 
in  updating  procedure  I  gave  the  same  results  as  using  the  simpler 
MLt  estimates.  The  more  complicated  model  B  often  (but  not  always) 
improved  predictions  ot  changes.  Adding  information  about  winds  to 
either  model  A  or  b  never  improved  prediction  much,  using  shrinkage 
with  the  updating  procedure  11  once  again  tended  to  improve  pre¬ 
diction  of  changes  from  stratus  to  no  stratus,  but  tended  to  worsen 
prediction  ot  a  change  from  no  stratus  to  stratus.  Updating 
procedure  Ill  often  seemed  to  do  better  in  predicting  changes  from 
no  stratus  to  stratus  than  updating  procedure  II;  however,  it  did 
worse  when  predicting  changes  from  stratus  to  no  stratus.  Updating 
procedure  I  always  did  at  least  as  well  as  in  predicting  changes 
from  stratus  to  no  stratus  but  sometimes  not  as  well  as  III  in 
predicting  changes  from  no  stratus  to  stratus.  Model  u  with  an 
updating  procedure  often  did  better  than  Model  A  with  updating 
particularly  in  predicting  changes  from  no  stratus  to  stratus.  In 
summary,  models  with  updating  sometimes  did  better  than  models  with 
no  updating,  but  the  improvement  was  surprisingly  small. 


Threat  Scores  tor  Changes  and  Fraction  of  Predictions 
Correct  for  iy61  Predictions 

Based  on  Models  With  and  Without  Updating 

Model  A 


Data  Used 

to  Fit  Model  1958-1960 


1960 


1958-1960 


Data  Used 


to  Fit  Model 


NO 

MLE 

Robust 

0.5 

0.41 

0.5 

I960 


NO 


.21 

.58 

0.3910.35 


0.45 


MLE 

Robust 

0.5 

0.5 

0.43 


1958-1960 


0.5  10.41 


0.20  0.26 


0.50  0.54 


0.34  0.38 


0.79  0.78 


0.8110.78!  0.81  |  ! 0.80!  | 0.84 |  0.84  !  | 0 . 81 !  | 0 . 77 
Model  A  has  explanatory  variables:  constant,  y  *n(At) 

Model  B  has  explanatory  variables:  constant,  NSt,  NNSfc,  tn(At> 


Fraction  correct  using  persistence  is  0.76 


Table  13 


Threat  Scores  for  changes  and  traction  of  predictions  correct  tor  ly62 

Predictions 


Model  A 


Data  Used 


to  Pit  Model 


UDdatin 


Method 


1958-1961 


NO 


MLE 


1961 


NO 


MLE 


1958—1961 


0.5  0.37 


0.1710.21 


0 .47  I U .40 


0.34 10.31 


U. 79  0.78 


MLE 

0.5  I 

0.37 

0.31 


0.15 


0.74 


U . 5  I  0 . 37 


0.31 


0.15 


0.1710.23 


U. 4410. 40 


0.76  0.75 


0 .33 10 .32 


0.7b  0.7y 


Model  a 


Data  Used 


to  Pit  Model 


Uudatin 


1950-1961 


NO 

MLE  | 

Robust 

0.5 

0.37 

U.5 

1961 


NO 


0.17  0.21 


0.4010.40 


Robust  I  I MLE  I  I MLE 


0.5  0.5 


195b-iy 61 


0.5  10.37 


.17  0.15 

.29  0.37 


0.0b  0.20 


0.26  0.22 


0.31  0.31  0.2b  0.24  0.2b|  0.2b  I  I  0 . 19  I  I  0 . 21 1  I  0 . 32  I  0 . 30 


PC  I j  0 . 76 [ 0 . 76 |  0.75  | J 0 . 73 | | 0 . 74 j  0.74  | |0.71f |0.71( (0.77(0.77 

Model  A  has  explanatory  variables:  constant,  y  £n  (At) 

Model  ti  has  explanatory  variables:  constant,  NSt,  NNSt,  H  Jin  ( ) 


Fraction  correct  using  persistence  is  0.7b 
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The  fraction  of  correct  predictions  of  daily  stratus  or  no 
stratus  nede  during  both  years  using  only  persistence  is  0 


Table  lb 


Validations  tor  Rolling  tits 
(10  days  at  a  time  initiating  with  only  the 
previous  year) 


Model 

(Fitting  Procedure) 
Cutoff 


A 

(MLE ) 

I  a 


U.b2 


B 

(MLE  ) 
U.b 


( Kobust ) 
U.b 


Stratus 
no  Stratus 

A\F 

1 

U 

u 

U 

transitions 

0 

1 

1 

1 

-*•  U 

*  1 

19by-6U 


U 

U 

u 

U 

1 

1 

-► 

0 

1 

1 

U  -*•  0 

u  1 
1  +  u 


number  of  days  of  stratus  during  all  previous  years 
number  of  days  in  all  previous  years 

Model  A  explanatory  variables:  constant,  y  £n(At> 

Model  B  explanatory  variables:  constant,  Nt>t,  NNt>t,  ht_^,  £n(Afc) 

i 

i 


Table  lb 

One  year  Validations  tor  Updating  MLt;  fits 
of  models  for  one  day  ahead  and  dropping 
the  oldest  day  {cutoff  =  O.b) 


Model 


Initiating 
Data  Set 


u  ->•  u 

0  -►  i 

1  U 
1  -*•  1 


7  .36  4 

2  .93  26 


7.36  b 
4  .67  22 


7323 


t:  entire  previous  year  used  to  tit  initial  model 

H:  half  previous  year  used  to  tit  initial  model 

Model  A  explanatory  variables:  constant,  yfc_^,  £n(A  ) 

Model  tt  explanatory  variables:  constant,  NSt,  NNS^ ,  h  lr>(ht) 

t'C  =  traction  correct  predictions 


-•  v.v.v.v 


> 


Table  17 

Validation  of  Updating  of  Model  B  with  Shrinkage. 

The  Model  was  initially  fit  with  entire  previous 
year  and  one  point  from  new  year  added  and  oldest 
point  dropped  in  each  iteration. 


1960-61 


0  +  0 
0+1 
1  +  0 
1  +  1 


transitions 


0  +  0 
U  +  1 
1  +  0 
1  +  1 


I _ 1 _ 

1 

0 

FC 

12 

12 

.50 

5 

62 

.93 

S 

F 

53 

3 

.95 

4 

7 

.36 

9 

2 

.82 

8 

5 

.62 

wm 

■ 

23 

18 

.56 

8 

42 

.84 

14  .42  8 

63  .94  2 

F _ S_ 

2  .96  55 
8  .27  2 

2  .82  10 
.54 


22  .46 
45  .90 


16  .33 
65  .97 


33  5 

97  1 
S 

98  56 
18  1 
91  10 
46 


22  .46  15  26  .37 
46  .92  4  46  .92 


S 


FC  =  fraction  correct 


Model  B  explanatory  variables:  constant,  NSfc,  NNSt,  Jtn  ( A  t ) 

Cutoff  =  0.5  . 


Validations  tor  MLE  tits  without  updating  based 
on  difterent  amounts  ot  historical  data 


Validation 

Historical 

Yr. 

data 

1 

1960 

961 

I  1950-60 

i 

1961 

962 

1950-61 

A\F 

1 

0 

FC 

1 

0 

FC 

1 

0 

FC 

1 

0 

FC 

Model 

1 

13 

ll 

.54 

14 

io 

.58 

24 

17 

.59 

26 

~Tb 

.63 

A 

0 

10 

57 

.05 

7 

60 

.90 

3 

47 

.94 

4 

46 

.92 

transitions 

S 

F 

S 

F 

S 

F 

S 

F 

0+0 

40 

0 

.06 

53 

3 

.95 

39 

o 

1 

38 

T 

.91 

0+1 

5 

6 

.45 

3 

8 

.27 

0 

11 

0 

2 

9 

.10 

1  +  0 

9 

2 

.02 

7 

4 

.64 

0 

3 

.73 

8 

3 

.73 

1+1 

0 

5 

62 

11 

2 

85 

24 

6 

.00 

24 

6 

.80 

A\F 

n 

0 

FC 

1 

H 

FC 

H 

FC 

Model 

1 

10 

n=g 

UJ 

|U 

H-H 

1 

wvm 

■El 

.54 

.56 

B 

0 

59 

.00 

O 

EH 

Rtf 

1 

HI 

.80 

ta 

46 

.92 

transitions 

S 

F 

s 

F 

s 

v 

b 

F 

0+0 

50 

6 

.09 

3 

.95 

1 

m 

.97 

.97 

0+1 

6 

5 

.55 

3 

0 

.27 

I 

m 

.10 

SB 

1  +  0 

9 

2 

.02 

0 

3 

KB 

1 

mlM 

.55 

ms 

1  +  1 

0 

5 

62 

10 

3 

-i22j 

1 

pm 

10 

.67 

.70 

Cutoff  point  *  0.5  . 

Model  B  fit  robustly  to  data  1950-60  and  cross-validated  on  1961  yives 
the  same  results  as  MLE. 

Model  B  fit  robustly  to  data  1950-61  and  cross-validated  on  1962  yives 
the  same  results  as  MLE  except  in  the  cases  *  and  +  ;  tor  *  tne  corres¬ 
ponding  numbers  ace  5  and  45;  for  +  the  corresponding  numbers  are  7  and  4 


APPENDIX  E 


Survival  Models;  Relation  to  the  Logistic  Representation. 

E. 1  Preliminary  Models 

Suppose  a  system  occupies  one  of  two  states  for  a  varying 
("random")  time  period,  then  switches  to  the  other,  and  back. 
Such  events  occur  at  times  t  *  0,1,2,3,....  Such  is  the  case 
with  the  stratus-no  stratus  fluctuation  that  has  been  studied, 
but  is  also  true  of  many  other  weather-related  events,  rainfall- 
no  rainfall  being  a  prime  example. 


We  discuss  several  traditional  stochastic  models  as  a 
preliminary. 

Model  1:  Markov  Chain 

Let  denote  the  state  variable  of  the  system  at  time 


t.  Suppose  (here  i,j  =  0,1) 


-  pi3  > 0  > 


(E-l) 


in  particular,  no  further  past  history  is  useful: 

P{^tssj|Yt_1=i,Yt_2=a,Yt_3=b,...,Yt_£=k...}  =  p  (i 

for  all  i,j  and  all  t  . 

There  is  then  a  long-run  or  steady-state  distribution 
{ ir 0 , ir i >  that  satisfies  balance  equations: 


(E-2) 


*opoi  *  *ipio 


nlp10  =  *1“1I0^P10 


( E-3 ) 


v  *  P10  =  P01 

®  PlO  +  Pqi  '  *  Pin  +  Pf 


10  ^01 
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If  such  a  model  truly  described  nature,  i.e.  stratus  level  at 


an  airport,  then  it  ^  could  be  referred  to  as  the  climatological 
probability  of  stratus,  (Yt=l),  on  a  day.  Such  a  model  may  be 
fitted  to  data:  one  simply  estimates  p^,  for  example,  by 
the  fraction  of  changes  from  1  to  0  (stratus  to  no-stratus) 
observed  in  an  observational  period.  The  model  does  not  have 
the  capacity  to  incorporate  physical  parameters  or  explanatory 
variables,  such  as  dewpoint  depression. 

Model  2:  Two-State  Renewal  Process 

Let  S  represent  the  generic  length  of  a  stratus  period, 
i.e.  or  number  of  days  throughout  which  there  is  uninterrupted 
stratus  (Yfc=l).  Just  before  S  ,  and  just  after,  there  will  be 
periods  of  one  or  more  no-stratus  days;  let  such  a  generic  pe¬ 
riod  be  C-  (C  denotes  "clear");  (throughout  the  period 
Yfc  =  0).  If  is  a  sequence  of  statistically  indpendent 

stratus  periods  from  the  same  distribution,  and  {C^}  is  a 
collection  of  corresponding  clear  periods,  then  the  time  history 
of  system  state  appears  as  below: 


V.  t 


1 


•  •  •  • 


<- -  - -*•«- -  - -*•  — ♦  «■ - C3 
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VA. 


•  V-/ 

- 


.v.-- 


In  the  long  run. 


lim  P{Y  =1} 
t+»  c 


_ Mean  Length  of  Stratus  Period _ 

Mean  Length  of  Strat.  +  Mean  Length  of  No-Strat. 


(E-5) 


The  above  can  be  called  the  climatological  probability  of  stratus 
on  a  day.  Strictly,  the  two-state  renewal  process  model  stipulates 
that  the  sequence  of  stratus  day  periods  {S..}  is  one  in<3e“ 
pendent,  identically  distributed  random  variables,  as  is  the 
sequence  of  clear  day  periods  {C^};  the  two  sequences  are  mutu¬ 
ally  independent.  The  Markov  chain  model  is  a  special  case  of 
the  two-state  renewal  model  in  which  stratus  periods,  generically 
S  ,  have  a  geometric  distribution  with  mean  E[S]  ,  and  the  clear 
periods,  C  ,  have  their,  generally  different,  geometric  distri¬ 
bution  with  mean  E(C]. 

Once  again,  this  model  contains  no  direct  accounting  for 
the  possible  influence  of  explanatory  variables  upon  the  proba¬ 
bilities  of  stratus  state  changes. 


E . 2  The  General  Survival  Model 

Suppose  a  forecaster  is  in  action  at  time  t  .  He  easily 
notes  the  current  system  state;  suppose  Y  *  0  ,  i.e.  no 
stratus.  He  wishes  to  predict  the  system  state  at  t  +  1.  A 
believer  in  Model  2  will  act  in  an  actuarial  fashion,  computing 
the  conditional  probability  that  the  same  state  will  prevail 
("survival"  occurs),  given  that  the  current  clear  state  has 
lasted  for  d  days: 


( E-6 ) 


the  quantities  h0(d),  h^fd)  may  be  referred  to  as  the  hazards 
associated  with  the  states  in  question,  for 
-h, (d+1 ) 

1  -  e  -  h^(d+l)  if  h-^d+l)  is  small 

is  the  conditional  probability,  or,  picturesquely,  hazard ,  that 
a  stratus  period  of  duration  ( "lifelength"  or  "age")  d  actually 
"dies",  or  changes  to  a  non-stratus  period  at  age  d+1. 

Similarly  when  a  non-stratus  period  is  in  progress,  the  change 
occurs  with  hazard  hQ(d+l). 

A  promising  enterprise  is  now  to  enhance  the  above  forecast 
of  survival,  or  death,  at  age  d+1  by  further  relevant  informa¬ 
tion  about  the  physical  environment  of  the  process.  Under  present 
circumstances ,  i.e.  when  forecasting  stratus,  one  might  well  use 


dewpoint  depression  as  well  as  previous  days  of  stratus  (or 

no-stratus).  Other  explanatory  variables  might  well  be  appropri¬ 
ate,  and  can  perhaps  be  identified  from  physical  arguments  aug¬ 
mented  by  graphical  or  other  exploratory  techniques. 

In  order  to  utilize  the  hazard  notion  in  a  regression 
context  it  is  convenient  to  put 

hQ  =  exp{xfc£0}  ( E-10  ) 

where  for  instance  the  vector  of  explanatory  variables  might  be 

xfc  =  (l,NSt,At,Hfc_1,t)  ( E-ll ) 

and 

£-0  =  (e01' **  *  ,e0p)  (E-12) 

is  the  required  system  of  constants.  A  form  such  as  (E-10)  can 
never  be  negative,  a  minimal  requirement.  Precisely  the  expres¬ 
sion  (E-10)  has  been  used  by  Cox  (1972)  for  describing  hazards. 
Actually  Cox's  hazard  is  written  as 

A  ( t  )exp{xj&)  (E-13) 

Suppose  observations  are  available  on  n  days:  these 
are  of  the  form 


(yt'xtl'xt2",,'xtp)  ' 
where,  as  was  mentioned  earlier,  possibly 

Xtl  =  xt2  =  xt3  =  Ht-1'  xt4  =  NSt^i,e*  #  days  of 


continuous  stratus) 


Note  that  interactions  and  transformations  can  directly  be 
included;  e.g.  simply  put  xfc5  =  xt3xt2  =  Ht_i  x  ln(Afc)  to 
represent  an  interaction  term. 

Now  the  likelihood  for  the  3^  vector  is 


L(i,,;^x)  =  n 

t=l 


-h  (x  )  y 
le  j 


-h  (x  >  1-y 
tll-e  u  t  J 


(t;-l4) 


taking  logs,  we  get 


A(£)  =  l  [yj-hytx,.)  +  (l-yt)£n[l-e  ]] 


=  -  l  [yt  exp[xt£0J  +  ( l-yt )  £n  ( l-exptx^.3^}  )  ]  (B-lb) 


and  this  can  be  maximized  by  choice  of  J3y  ,  a  non-linear 
optimization  task.  The  usual  approach  would  involve  differen¬ 
tiation  with  respect  to  6^^  ,  and  solving  the  resulting  non¬ 
linear  system  by  a  variation  of  the  Newton-Raphson  method. 
Package  programs  are  available  tor  such  a  task. 


^(m,^)  =  KVl^lV^’VrJ’-  ‘t-Irll’j-WJ'iSt*  St1 

-h  (m+1 ) 


=  e 


=  exp  [-Aj ( t )exp{xtjH ]  .  (E-16) 


Ordinarily  l^(t)  is  thought  of  as  a  deterministic  but 
unknown  function  of  t  ,  i.e.  time  since  start  of  the  process. 

In  an  application  to  stratus  forecasting,  and  to  other  weather 
phenomena,  it  may  be  desirable  to  allow  a  dependence  of  the  basic 
hazard  rate  upon  m  ,  the  duration  or  "age"  of  the  current  epi¬ 
sode  (stratus,  or  non-stratus  as  the  case  may  be):  X^(m).  This 
necessitates  a  specification,  either  parametric  or  non-parametric ; 
the  Cox  procedure  in  Cox  [1972]  was  to  estimate  X^(m)  non- 
parametrically . 

In  order  to  associate  the  Cox  model  explicitly  with  the 

logistic,  adopt  the  attitude  that  X . (m)  is  actually  random, 

3  - 

and  is  independently  distributed  from  period  to  period,  with  a 
distribution  characteristic  of  the  state.  In  such  a  case  we  can 
do  no  better  than  to  attempt  to  estimate  the  model 


pj(m,xt)  *  E(exp  [-X^  (m)exp{JCtJ3}  ]  )  , 


( E-17 ) 


$ 

3 


ft 

* 


where  the  expectation  operator  E(»)  is  over  the  distribution  of 

the  now-random  hazard.  To  be  quite  specific,  allow  x^.(m)  to 

have  the  Gamma  distribution  for  a.,Y.  >  0  , 

3  3 


P{ 


*  "oiy/(oiY> 

Xj(m)  <  x}  -  /  e  ^_I__lYjdy  , 


(E-18) 


where  and  -y^  characterize  the  hazard  variability  when 

state  j  is  in  effect.  Now  for  this  distribution  the  expectation 
is  explicitly  in  terms  of  the  Laplace  transform: 


This  is  the  probability  of  survival  in  state  j  for  one  more 
period  (no  change). 


Now  the  probability  of  a  change  is,  using  the  above 
randomizing  model, 

n  -1 

Y  • 


P{Yt+1*j|Yt=j,xt}  =  1  - 


1  *t& 
1  +  -  e  Z 


( E-20 ) 


and,  in  case  Yj  =  1  (mixing  by  an  exponential)  we  find 


HYt  +  1*jUt-j,Xt=xt)  =  - 1 


-1  — t— 
a  .  e 


-1  -t^ 

1  +  a.  e  Z 


( E-2 1 ) 


which  is  precisely  the  logistic  regression  model.  It  is  thus 
clear  that  the  logistic  regression  model  can  arise  from  a  plau¬ 
sible  stochastic  mechanism.  Note  that  the  derivation  presents 
an  alternative  to  the  simple  logistic  model  that  incorporates  one 
more  parameter,  thus  possibly  allowing  for  the  better  represen¬ 
tation  of  a  wider  range  of  binary  response  data  than  by  the 


E  .4  The  Cox  Survival  Model  with  Stable-Law  Random  Hazara. 

It  is  of  interest  to  investigate  other  ways  of  introducing 
auxiliary  randomness  into  the  Cox  proportional  hazard  survival 
model.  This  process  considered  here  represents  model  parameter 
fluctuation  from  day  to  day  (in  the  present  application)  tnat  is 
not  covered  by  the  simple  representation 


P{Yt+i=j  |  Y  =j  »X.t=xt>  =  exp[-  X  exp{xt£}]  ; 


(E-22) 


instead  the  form  of  the  randomized  model  is  obtained  by  inserting  a 
term  in  the  hazard: 

P{Yt+1=j  j  Yt=j,Xt=2it,et^  =  exPt”  Xet  exp{xt£}J  *  (k-^) 

Now  is  not  directly  observable  or  estimable  if,  as 

is  assumed,  only  one  observation  on  a  probability  depending  on 
each  e t  is  available.  Effectively  one  observes  the  marginal 
probability  of  Yfc  +  1  =  3  •  yiven  Yfc=j  and  values  of  the  explana¬ 
tory  variables  Xfc: 


P{Yt+1=j | Yt=j,Xt=xt)  =  E£  ( exp  [-Xe  t  exp{xt£}J) 


(E-24) 


Suppose  now  that  efc  obeys  a  positive  stable  law  distri¬ 
bution  (see  Peller  (I9b6),  p.  17U).  In  this  case  the  Laplace 
transform  of  et  is  always  the  form 


tie  t]  =  e_(os)Y  ,  U  <  y  <  1  . 


(E-2b) 


Unfortunately,  explicit  formulas  tor  the  density  of  e  are 
generally  not  available;  that  for  y  =  1/2  is  an  exception: 


*  ,  1,  .  1 

f  (x;o,7)  -  - - 777 

Et  2  ^(x/a)^ 


-o/2x 


( E— 26 ) 


It  follows  generally  and  directly  from  (E-^4)  and  (E-25) 
that  if  et  is  positive  stable  the  marginal  probability  of  one- 
day  survival  is 

P^t+l=J  I  Yt  =  J'-t=-t}  =  (exP[-Xet  exPtxt^}J) 

=  exp[-(Xa)Y  exp{xty£}]  ,  (E-27) 

once  again  exactly  a  Cox  model  (i.e.  of  the  form  (E-22))  but  now 
with  the  parameters 


=  (  Xa ) 


6*  =  Y3  . 


(E-^8) 


Thus  the  particular  Cox  model  discussed  is  completely  insensi¬ 
tive  to  the  type  of  hazard  randomization ^introduced  here.  Notice 
that  the  effects  of  the  explanatory  variablesNer  covariates, 
xt  ,  as  measured  by  the  magnitudes  of  their  coefficients 
'£  *  Y0.  r  Y  <  1)  r  becomes  progressively  smaller  as  d  ;  the 

latter  "shrinkage"  tendency  is  associated  with  greater  ara^L 
greater  "spread"  of  the  distribution  (here  "spread"  cai»Dt 

be  measured  by  variance,  for  the  latter  tails  to  exist).  It  A. 
follows  that  the  predictive  (in  terms  of  explanatory  variables )\ 
power  of  a  Cox  model  could  improve  by  reducing  any  tendency  , 
towards  hazard  randomization  of  the  type  exhibited,  if  such  is  ■ 
possible . 

Further  work  on  randomized  Cox  models  yielding  binary  t 


series  will  be  reported  elsewhere 


APPENDIX  F 


Spectral  Analysis  of  Hourly  Stratus  Levels  and  Dew-Point 
Depression  for  July-September  1958. 

The  data  for  the  height  of  the  stratus  level  are  hourly 
records,  in  units  of  hundreds  of  feet,  of  the  height  of  the 
stratus  layer.  There  are  2208  such  observations.  The  data  is 
integer  valued  with  a  minimum  of  three  and  a  maximum  of  999; 

1410  of  the  observations  are  999  which  denotes  the  category  of 
no  stratus  (infinite  height);  the  next  largest  observational 
value  is  888,  of  which  there  are  62;  all  the  rest  of  the  obser¬ 
vations  are  less  than  or  equal  to  180. 

Logarithms  of  the  stratus  heights  were  taken  to  reduce 
the  range  of  the  data.  Figure  (4)  shows  the  in  (normalized 
periodogram)  of  the  transformed  data;  (cf.  Cox  and  Lewis  (1966) 
pp.  99).  If  the  data  are  uncorrelated  and  stationary  then 
the  values  of  the  normalized  periodogram  will  appear  independent 
and  have  the  unit  exponential  distribution.  The  line  is  at  the 
95%  quantile  for  the  maximum  of  1104  independent  unit  exponentials 
The  largest  peak  occurs  at  91.  Other  peaks  occur  at  1  and  276. 

The  peak  at  91  suggests  that  a  24  hour  cycle  may  be  present;  the 
peak  at  276  suggests  an  eight  hour  cycle.  The  peaks  around  1  may 
be  attributable  to  the  dependence  of  the  data.  A  least  squares 
cyclic  fit  for  the  24  and  eight  hour  cycles  was  next  carried  out. 
The  residuals  from  the  fit  were  then  whitened,  using  an  AR2  proc¬ 
ess.  Figure  (5)  shows  the  log  (normalized  periodogram)  of  the  resi 
duals  following  the  cyclic  fit  and  AR2  whitening.  There  are  still 
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two  values  of  the  periodogram  above  the  quantile  line  at  91  and 
160.  Figure  (6)  exhibits  the  cumulative  periodogram  of  the 
residuals.  If  the  residuals  were  uncorrelated  and  stationary, 
then  the  cumulative  periodogram  would  have  the  same  distribu¬ 
tion  as  the  order  statistics  of  an  independent  sample  of  1104 
independent  uniform  random  variables.  The  Kolmogorov-Smirnov 
statistic  of  goodness  of  fit  is  1.12  (Theoretical  99%  quantile  is 
1.628)  and  the  Anderson-Darling  statistic  is  1.39  (theoretical  99% 
quantile  is  3.857). 

As  a  result  of  the  above,  we  model  the  logarithm  of 
hourly  stratus  heights  as 

in  L  =  (-1.55)sin(^||)  -  0.322  cos(-^J) 

(-0.202)sin(£jp)  -  0.300  cos(^)  +  Afc  ; 

Afc  =  0.750At_I  +  0 . 078Afc_2  +  E* 

0  . 

where  Efc  are  stationary  and  uncorrelated  random  variables. 

£ 

Figure  ( 7 )  shows  the  residuals  E  . 

A  similar  analysis  was  carried  out  on  Hn [dew  point 
depression  +  1]  ( LDPD ) .  The  data  range  from  0  to  9.21;  the  values 

have  a  discrete  nature,  but  not  as  noticeably  as  that  of  the 
stratus  levels.  The  Jln-per iodogram  of  LDPD  is  given  in 
Figure  8.  There  are  visible  peaks  at  92,  186  and  276,  as 
well  as  near  1.  The  peaks  at  92,  186,  and  276  suggest  24  hr, 

12  hr,  and  8  hr  cycles,  respectively.  A  least-squares  cyclic 
fit  was  made,  and  the  residuals  from  the  fit  were  once  again 
whitened  with  an  AR2  process. 
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Figure  9  gives  the  cumulative  periodogram  of  the  residuals 


with  the  Kolmogorov-Smirnov  and  Anderson-Darling  statistics.  Our 
model  for  LDPD  is 

LDPDt  *  0.015  sin(~|-)  +  0.019  cos(~) 

+  0.060  sin(-^|)  -  0.031  cos(^~) 

+  0.087  sin(-^p)  +  0.061  cos  (— jj— ■) 


.802  Bt_1  +  .092  Bfc_2  +  Efc  . 


A  graph  of  the  residuals  {£°}  is  presented  in  Figure  10, 
Note  the  two  large  residuals. 

z 

Next  the  residuals,  { Et }  of  the  Jtn  (stratus  height) 
level  were  regressed  on  {E^}  ,  the  residuals  of  tn  (dew  point 
depression)  using  a  least-squares  procedure  and  the  robust  bi¬ 
weight  procedure. 


=  0.0005  +  0.2712  E{ 
(.022)  (.076) 


(Least  Squares) 
(Standard  Errors) 


=  U.0073  +  0.084  E] 


(Biweight) 


If  the  two  points  corresponding  to  large  LDPD  residuals 
are  deleted  than  the  following  values  for  regression  coefficients 
are  obtained 


*  0.009  +  .3421  E^ 
(.022)  (.086) 


(Least  Squares) 
(Standard  Errors) 


d 

Et  = 


d 

U . 0083  +  0.1372  Efc 


A7 


(Biweight ) 


The  positive  slope  of  the  regression  of  the  residuals 
suggest  that  the  larger  the  dew  point  depression,  the  higher 
the  stratus  level.  Since  the  regression  was  performed  on  the 
residuals  of  both  series  after  detrending  and  whitening  the 
relationship  should  not  be  strongly  influenced  by  non-stationary 
and  dependence  effects  in  the  marginal  series. 

The  small  values  of  the  fitted  slopes  suggest  that  the 
relationship  is  present,  but  not  very  strong.  This  relationship 
together  with  the  box  plots  of  Figure  1  provide  evidence  that 
dewpoint  depression  and  the  existence  of  stratus  are  indeed 
related. 

The  residuals  were  also  examined  for  lagged  relationships, 

£  d 

eg.  a  relationship  between  and  .  No  relationships 

were  evident. 
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APPENDIX  G 


Threat  Scores 

In  this  section  we  discuss  the  asymptotic  distribution  of 


a  threat  score. 


Consider  an  event  that  either  does  or  does  not  occur  on 


day  n,  n  =  1, . . .  ,N 


Y  = 
n 


1  if  event  occurs  on  day  n  ; 


U  if  event  does  not  occur  on  day  n 


>9 


X  = 
n 


1  if  the  prediction  is  made  that  the  event 
occurs  on  day  n  ; 


0  if  the  prediction  is  made  that  the  event 
does  not  occur  on  day  n  . 


*  l  <1-*„>U-X„> 


be  the  number  of  correct  predictions  of  the  event  not  occurring; 


i,  =  I  Y  X  , 
1  n“l  n  n 


be  the  number  of  correct  predictions  ot  the  event  occurring; 
N 

Fn  =  I  (1-Y  )X  , 

0  n  n 

n  =  l 

be  the  number  of  incorrect  predictions  when  no  event  occurs; 


12 


t 


N 


=  nil  VW 


the  number  of  incorrect  predictions  when  the  event  occurs. 
The  threat  score  for  predicting  that  the  event  occurs  is 

S, 


T  = 


S1  +  Fu  +  Fi 


(G-l) 


Equations  (A-2) ,  (a-3),  and  (A-4)  give  threat  scores  for  predic¬ 
ting  changes  from  no  stratus  to  stratus,  changes  from  stratus 
to  no  stratus,  and  all  changes  respectively. 

Note  that 


T  * 


N 


1-S, 


where  ftx^x^)  = 


If  there  is  perfect  prediction,  then  S0+  Sj  =  N  and  T=i.  If  s^=U 
then  T=U.  In  the  case  of  predicting  changes  from  no  stratus  to 
stratus,  the  threat  score  would  be  0  if  prediction  of  stratus 
is  done  using  only  persistence. 

Assume  (Sy,  t'0,  Ej,  )  has  a  multinomial  distribution  with 
parameters  N,  yyu,  y^,  y^,  y-^.  Asymptotically  as 

/  ^l\ 

N  -*■  oo  l—'  — — '  —I  has  a  normal  distribution  with  mean 

’  \  N  N  N  N  / 

)  and  covariance  matrix  —  \  where  \  equals 


(y 


uu'  Tulr  'll'  T1U 
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UU'A  7UU> 

TUUYUl 

YUUTll  ~ 

TOUYiU 

«*«• 

UUYU1 

YU1(1"YU1) 

■  YU1Y11 

YUl»YlU 

UUY11 

-  YU1Y11 

Y11(1"Y11) 

-  Y11Y1U 

U0Y1U 

YU1Y10 

Y11Y10 

Yiu(i- 

(c£.  Bishop  et  al.  (1975)). 


A  Taylor  expansion  of  f  in  (G-2)  yields 


11  1  /f 0 

i_You  1_Yuu  'N  "  Y°° 


1 


Y11 

"YUU 


2 


Y 


11 


) 


+  o 


max 


Gr  -  Tuo)'(;r  -  '<ii) 


It  follows  from  an  application  of  the  multidimensional  &  - 


method  (cf.  Theorem  14.6-2  of  Bishop  et  al.)  that  as  N  *  »  ,  T 

Y1 1 

has  an  asymptotic  normal  distribution  with  mean  yr -  and 


uu 


variance  77  a ^  where 
N 


2 

a  =  y 


1  Y11  YUU 
11  /1 

^ 1_Y  UU 


If  Yyy  is  fixed,  then  0  has  a  maximum  at 

which  it  has  the  value  7- rT^ - r  . 

2(i“Yuu) 

At  y1x  =  u  and  y1x  =  1  -  Yuy  ,  aZ  =  u  . 


1— Y 


UU 


at 


74 


Another  application  of  the  6-method  shows  that  the  trans¬ 


formed  threat  score  arcsin/T  has  an  asymptotic  normal  distribu 

Y11 

tion  with  mean  arcsin/-* -  and  variance 


N  4  (1  -  Y  0q )  * 


Thus,  if  Yqq  is  fixed,  then  the  transformed  threat  score 
arcsin/T  has  a  variance  which  does  not  depend  on  .  How¬ 

ever,  both  the  threat  score,  T  ,  and  arcsin/T  can  have  large 
variance  if  y nn  is  close  to  1  (which  will  often  be  the  case). 
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